Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-53335, topology: do not raise exception when loading invalid distance matrices (NUMA) #6249

Merged
merged 9 commits into from
Feb 3, 2025

Conversation

psafont
Copy link
Member

@psafont psafont commented Jan 24, 2025

Unreachable nodes do not contain any CPUs, and therefore VCPUs cannot be
scheduled on them. They marked with a value of (2ˆ32) - 1. Instead of failing
to produce a NUMA object that allows for scheduling, create an object that
contains only schedulable NUMA nodes. This means changing how the
datastructures node_cpus and candidates are created to ignore the unreachable
ones.

Fixes two minor potential issues (distances being NaN, and adding duplicates to candidates, which adds to running time); and separates the xenopsd unit tests into 3 binaries for ease of testing.

Fixes #6240

@psafont psafont requested a review from edwintorok January 24, 2025 15:19
@psafont psafont force-pushed the private/paus/numaybe branch 2 times, most recently from d70811d to 8d287f5 Compare January 24, 2025 15:30
Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small comments; I did not look at the overall logic

ocaml/xenopsd/lib/topology.ml Show resolved Hide resolved
ocaml/xenopsd/lib/topology.ml Outdated Show resolved Hide resolved
ocaml/xenopsd/lib/topology.ml Outdated Show resolved Hide resolved
@psafont psafont force-pushed the private/paus/numaybe branch 2 times, most recently from d4642ff to 619f8ba Compare January 27, 2025 10:56
Copy link
Contributor

@contificate contificate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not overly familiar with NUMA, but the changes make sense to me (especially if they fix the cited issue).

Only minor nitpicks:

  • Unrelated to this change: CPUSet could just include Set.Make(Int).
  • The Array.for_all check of the row shadows d for a different semantic purpose.

@psafont psafont force-pushed the private/paus/numaybe branch from 619f8ba to 2735b54 Compare January 30, 2025 11:02
@psafont psafont force-pushed the private/paus/numaybe branch from 2735b54 to e2d7d75 Compare February 3, 2025 11:50
let self_distance = d.(i).(i) in
(distance_to_candidate self_distance, Seq.return i)
)
in
let numa_nodes = Array.length d in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could count how many reachable NUMA nodes we have, then we can avoid unreachable nodes artificially triggering the 16 node limit.

We should probably also log when the limit got triggered, so we can debug unexpected cases.

Although that can come as a separate PR, this is already a good improvement.

valid_nodes
|> seq_all_subsets
|> Seq.filter_map (node_distances d)
|> seq_append single_nodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the single nodes were always appended just in case something goes wrong with the filtering algorithm (which was meant to be somewhat smarter than brute force, or it may evolve in the future to be somewhat smarter).

As the algorithm currently looks like I agree that we don't need to append the single nodes here. Perhaps this would be a good condition to test for in the testcases, that single (reachable) nodes are always present with the expected value (unless such a test already exists).

Copy link
Contributor

@edwintorok edwintorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some suggestions on how the filtering could be extended to avoid artificially triggering the 16 NUMA node limit when not needed (although I don't think we currently have a system like that to test on, so this is purely theoretical, and something for the unit tests).

This allows to test independent modules faster more easily

Signed-off-by: Pau Ruiz Safont <[email protected]>
…nce matrices

Instead disable NUMA for the host

Fixes xapi-project#6240

Signed-off-by: Pau Ruiz Safont <[email protected]>
Now the tests use the actual data from the test specifications instead of being
hardcoded, and the distance matrices used for testing are in its own module for
better clarity.

Signed-off-by: Pau Ruiz Safont <[email protected]>
…able

Unreachable nodes do not contain any CPUs, and therefore VCPUs cannot be
scheduled on them. They marked with a value of (2ˆ32) - 1. Instead of failing
to produce a NUMA object that allows for scheduling, create an object that
contains only schedulable NUMA nodes. This means changing how the
datastructures node_cpus and candidates are created to ignore the unreachable
ones.

Signed-off-by: Pau Ruiz Safont <[email protected]>
These could be created accidentally by dividing by 0.

Signed-off-by: Pau Ruiz Safont <[email protected]>
It's unclear why the candidates with single nodes where always added, since the
algorithm that generates all the subsets already includes these.

Signed-off-by: Pau Ruiz Safont <[email protected]>
It's already printed by xenopsd, and now that development has stabilised,
unit-test can print this useful unformation.

Signed-off-by: Pau Ruiz Safont <[email protected]>
…culation

Now unreachable nodes are not considered when calculating all the subsets for
the NUMA nodes combinations for scheduling a domain.

Signed-off-by: Pau Ruiz Safont <[email protected]>
@psafont psafont force-pushed the private/paus/numaybe branch from e2d7d75 to 9badc14 Compare February 3, 2025 16:40
@psafont
Copy link
Member Author

psafont commented Feb 3, 2025

Rebased on top of latest master and made the limit in gen_candidates depend on the number of reachable nodes, ignoring the unreachable ones, as well as logging the situation.

@stormi
Copy link
Contributor

stormi commented Feb 3, 2025

Do you want another user test based on the latest iteration?

@psafont
Copy link
Member Author

psafont commented Feb 3, 2025

I don't think it's needed, thanks

@psafont psafont added this pull request to the merge queue Feb 3, 2025
Merged via the queue into xapi-project:master with commit 3234be2 Feb 3, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Buggy or uncommon ACPI tables break xenopsd-xc startup (and thus XAPI's startup)
5 participants